autocomplete milestone #526

Merged
merged 23 commits into from
Apr 29, 2016


missinglink
Member

@missinglink missinglink commented Apr 25, 2016

This PR contains all the work that has gone into the Autocomplete Improvements Milestone. It is paired with a corresponding schema update.

The PR focuses on fixing bugs and flaky behavior in /v1/autocomplete that have been reported by our users. I've also taken the opportunity to triage all the incoming requests for autocomplete, clean up some messy parts and add more code coverage.

The major difference in behavior comes from how we handle analysis, and especially synonym substitution, for autocomplete. Here is an example of a query which was not previously possible using phrase matching:

improved analysis

/v1/autocomplete?text=vic uni wellington

1)  Victoria University of Wellington, New Zealand
...

More information about the analysis changes can be found in: pelias/schema#105

final token is a stopword bug

This reported bug has been squashed: pelias/pelias#211

/v1/autocomplete?text=green lane

 1) Green Lane, PA, USA
 2) Green Lane Farms, Lower Allen, PA, USA
...

improved server-side tokenizer

Improved parsing of source data containing commas or slashes as delimiters, such as in Bedell Street/133rd Avenue. see: pelias/schema#113
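
To illustrate the idea, here is a minimal sketch of delimiter-aware splitting in plain JavaScript. The real work happens in the Elasticsearch analysis chain described in pelias/schema#113; `tokenizeName` is a hypothetical helper written for this example only and is not part of Pelias.

```javascript
// Hypothetical helper (not the Pelias analyzer): treat ',' and '/' as hard
// delimiters between name parts, then split each part on whitespace.
function tokenizeName(text) {
  return text
    .split(/[,\/]/)                      // split on commas and slashes
    .map((part) => part.trim())          // drop padding around each part
    .filter((part) => part.length > 0)   // ignore empty parts such as 'a,,b'
    .map((part) => part.toLowerCase().split(/\s+/));
}

console.log(tokenizeName('Bedell Street/133rd Avenue'));
// [ [ 'bedell', 'street' ], [ '133rd', 'avenue' ] ]
```

Keeping each delimited part as its own token group means a query for "133rd avenue" can match the second street name without being diluted by the first.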

improved client-side tokenizer

In order to generate the best and most efficient queries possible, we've added a client-side tokenizer which mimics the functionality of the server-side tokenizer. see: #529
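
One thing a client-side tokenizer makes possible is telling finished words apart from the word the user is still typing. The sketch below assumes that distinction is useful for query building; the PR text does not spell this out, and `splitTokens` and its return shape are hypothetical names chosen for illustration, not the code in #529.

```javascript
// Hypothetical sketch: split input on whitespace, commas and slashes, and
// treat the last token as incomplete unless the input ends with a delimiter.
function splitTokens(text) {
  const endsWithDelimiter = /[\s,\/]$/.test(text);
  const tokens = text.split(/[\s,\/]+/).filter((t) => t.length > 0);

  if (endsWithDelimiter || tokens.length === 0) {
    // the user finished the last word (or typed nothing yet)
    return { complete: tokens, incomplete: [] };
  }
  return {
    complete: tokens.slice(0, -1),  // words followed by a delimiter
    incomplete: tokens.slice(-1)    // the word still being typed
  };
}

console.log(splitTokens('vic uni wel'));
// { complete: [ 'vic', 'uni' ], incomplete: [ 'wel' ] }
```

Under this assumption, complete tokens could be matched as whole words while only the incomplete token needs prefix matching.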

improved handling of ordinals

This allows users to partially type an address such as "33rd street" as "33r", see pelias/schema#96
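
As a rough illustration of the behavior (not of the implementation, which lives in the schema analysis, see pelias/schema#96), the sketch below builds the English ordinal for a number and checks whether partial input is a prefix of it. Both function names are hypothetical.

```javascript
// Return the English ordinal suffix for a non-negative integer.
function ordinalSuffix(n) {
  const lastTwo = n % 100;
  if (lastTwo >= 11 && lastTwo <= 13) return 'th'; // 11th, 12th, 13th
  switch (n % 10) {
    case 1: return 'st';
    case 2: return 'nd';
    case 3: return 'rd';
    default: return 'th';
  }
}

// Hypothetical check: does the partially typed input prefix-match the
// ordinal form of the street number?
function matchesOrdinalPrefix(input, streetNumber) {
  const full = String(streetNumber) + ordinalSuffix(streetNumber);
  return input.length > 0 && full.startsWith(input.toLowerCase());
}

console.log(matchesOrdinalPrefix('33r', 33)); // true
console.log(matchesOrdinalPrefix('33t', 33)); // false
```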

improved admin matching, particularly for non-delimited queries

This PR also includes a fix for parent.borough not being targeted during admin matching. see #527

/v1/autocomplete?text=200 dean st brooklyn

 1) 200 Dean Street, Brooklyn, New York, NY, USA

indexing of single digit numbers

In the past we did not index single-digit numbers. They are now present in the index without causing a major increase in disk/RAM usage.

/v1/autocomplete?text=grolmanstr 8

1)  Grolmanstr. 8, Köln, Germany
...

tighter 'focus' clustering and improved localized matching

see: http://missinglink.embed.s3.amazonaws.com/pelias_clustering.png

house number can be provided in 1st, 2nd or 3rd position

mainly for Germanic address schemes. see #489

/v1/autocomplete?text=wilmersdorfer str 51

1)  Wilmersdorfer Straße 51, Kiel, Germany
...
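
A minimal sketch of what "1st, 2nd or 3rd position" means for the input text. `findHouseNumber` is a hypothetical helper for illustration; the actual matching is done by the query and analysis changes referenced in #489, not by code like this.

```javascript
// Hypothetical helper: look for a house-number-like token (digits with an
// optional single trailing letter, e.g. '51' or '8a') in the first three
// whitespace-separated positions of the input.
function findHouseNumber(text) {
  const tokens = text.trim().split(/\s+/);
  const limit = Math.min(tokens.length, 3);

  for (let i = 0; i < limit; i++) {
    if (/^\d+[a-z]?$/i.test(tokens[i])) {
      return {
        position: i + 1,                         // 1-based position
        number: tokens[i],
        rest: tokens.filter((_, j) => j !== i)   // remaining street tokens
      };
    }
  }
  return null; // no house number in the first three positions
}

console.log(findHouseNumber('wilmersdorfer str 51'));
// { position: 3, number: '51', rest: [ 'wilmersdorfer', 'str' ] }
```

Note that an ordinal such as "33rd" is deliberately not treated as a house number here, since it has more than one trailing letter.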

boosting of exact matches

An additional query segment was added to ensure that results which closely match the input text rank higher than those which only partially match.
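
The effect can be sketched outside Elasticsearch as a two-part score: a base score for partial matches plus a bonus when the result matches the input exactly. This is an illustration of the ranking idea only; `EXACT_BOOST`, `score` and `rank` are hypothetical and the numbers are arbitrary.

```javascript
// Arbitrary illustrative bonus, not a value taken from the Pelias query defaults.
const EXACT_BOOST = 10;

function score(input, label) {
  const wanted = input.toLowerCase().split(/\s+/).filter(Boolean);
  const have = label.toLowerCase().split(/[\s,]+/).filter(Boolean);

  // base segment: one point per input token that prefixes some label token
  let total = wanted.filter((w) => have.some((h) => h.startsWith(w))).length;

  // extra segment: bonus when the label tokens equal the input tokens exactly
  const exact = wanted.length === have.length &&
    wanted.every((w, i) => w === have[i]);
  if (exact) total += EXACT_BOOST;

  return total;
}

// Sort labels by descending score without mutating the input array.
function rank(input, labels) {
  return labels.slice().sort((a, b) => score(input, b) - score(input, a));
}

console.log(rank('green lane', ['Green Lane Farms', 'Green Lane']));
// [ 'Green Lane', 'Green Lane Farms' ]
```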

@@ -82,6 +82,10 @@ module.exports = _.merge({}, peliasQuery.defaults, {
'admin:neighbourhood:field': 'parent.neighbourhood',
'admin:neighbourhood:boost': 200,

'admin:borough:analyzer': 'peliasAdmin',
'admin:borough:field': 'parent.borough',
'admin:borough:boost': 800,
Contributor

why so high?

Member Author

I simply copied the value from 'country'. I will drop it down to the value for 'region' instead if you feel it's too high. What would you suggest as a good value here?

@orangejulius
Member

orangejulius commented Apr 26, 2016

We noticed a regression from pelias/pelias#164. When searching for "New York, NY" from Stamford, CT, autocomplete used to show New York City first; now it doesn't.

Autocomplete for New York, NY

There's now an acceptance test for this: pelias/acceptance-tests#232

@missinglink missinglink added this to the Autocomplete Improvements milestone Apr 29, 2016
@orangejulius
Member

👍

@orangejulius orangejulius merged commit e04efed into master Apr 29, 2016
@orangejulius orangejulius deleted the missinglink branch May 25, 2016 01:42